Bidirectional Modelling for Short Duration Language Identification
Authors
Abstract
Language identification (LID) systems typically employ i-vectors as fixed-length representations of utterances. However, it may not be possible to reliably estimate i-vectors from short utterances, which in turn could reduce language identification accuracy. Recently, Long Short-Term Memory networks (LSTMs) have been shown to model short utterances better in the context of language identification. This paper explores the use of bidirectional LSTMs for language identification, with the aim of modelling temporal dependencies between past and future frame-based features in short utterances. Specifically, an end-to-end system for short-duration language identification employing bidirectional LSTM models of utterances is proposed. Evaluations on both the NIST 2007 and 2015 LRE show state-of-the-art performance.
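The idea of reading an utterance in both time directions can be illustrated with a minimal sketch. The abstract does not give the paper's actual architecture, features, pooling, or training procedure, so everything below is an assumption made for illustration: the layer sizes, the mean pooling over frames, the softmax output layer, and the hypothetical helper names (`init_lstm`, `run_lstm`, `bilstm_language_posteriors`). The weights are random and untrained, so the output only shows the data flow, not a working classifier.

```python
import math
import random


def init_lstm(n_in, n_hid, rng):
    """Random weights for the four LSTM gates (i, f, o, g); illustrative only."""
    def mat():
        return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hid)]
                for _ in range(n_hid)]
    return {gate: (mat(), [0.0] * n_hid) for gate in "ifog"}


def lstm_step(x, h, c, params):
    """One standard LSTM step on the concatenation of input x and state h."""
    z = x + h  # list concatenation: [x ; h]

    def affine(gate):
        rows, bias = params[gate]
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(rows, bias)]

    def sigmoid(values):
        return [1.0 / (1.0 + math.exp(-v)) for v in values]

    i, f, o = sigmoid(affine("i")), sigmoid(affine("f")), sigmoid(affine("o"))
    g = [math.tanh(v) for v in affine("g")]
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


def run_lstm(frames, params, n_hid):
    """Run an LSTM over a list of frames and return the hidden state per frame."""
    h, c = [0.0] * n_hid, [0.0] * n_hid
    outputs = []
    for x in frames:
        h, c = lstm_step(x, h, c, params)
        outputs.append(h)
    return outputs


def bilstm_language_posteriors(frames, fwd, bwd, out_w, out_b, n_hid):
    """Map frame-level features to language posteriors with a bidirectional LSTM."""
    forward = run_lstm(frames, fwd, n_hid)               # past -> future
    backward = run_lstm(frames[::-1], bwd, n_hid)[::-1]  # future -> past, re-aligned
    joint = [f + b for f, b in zip(forward, backward)]   # per-frame concatenation
    # Assumption: mean-pool over time to get one fixed-length utterance vector.
    pooled = [sum(col) / len(joint) for col in zip(*joint)]
    logits = [sum(w * v for w, v in zip(row, pooled)) + b
              for row, b in zip(out_w, out_b)]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
N_FEAT, N_HID, N_LANG = 4, 3, 2  # toy sizes, not taken from the paper
fwd = init_lstm(N_FEAT, N_HID, rng)
bwd = init_lstm(N_FEAT, N_HID, rng)
out_w = [[rng.uniform(-0.1, 0.1) for _ in range(2 * N_HID)] for _ in range(N_LANG)]
out_b = [0.0] * N_LANG
utterance = [[rng.gauss(0.0, 1.0) for _ in range(N_FEAT)] for _ in range(10)]
posteriors = bilstm_language_posteriors(utterance, fwd, bwd, out_w, out_b, N_HID)
```

Because the backward pass is re-aligned before concatenation, the representation of each frame depends on both earlier and later frames, which is the property the abstract highlights for short utterances.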
Similar resources
LanideNN: Multilingual Language Identification on Character Window
In language identification, a common first step in natural language processing, we want to automatically determine the language of some input text. Monolingual language identification assumes that the given document is written in one language. In multilingual language identification, the document is usually in two or three languages and we just want their names. We aim one step further and prop...
Rhythmic unit extraction and modelling for automatic language identification
This paper deals with an approach to automatic language identification based on rhythmic modelling. Besides phonetics and phonotactics, rhythm is actually one of the most promising features to be considered for language identification, even if its extraction and modelling are not a straightforward issue. Actually, one of the main problems to address is what to model. In this paper, an algorithm ...
Modified-prior i-vector estimation for language identification of short duration utterances
In this paper, we address the problem of Language Identification (LID) on short duration segments. Current state-of-the-art LID systems typically employ total variability i-Vector modeling for obtaining fixed length representation of utterances. However, when the utterances are short, only a small amount of data is available, and the estimated i-Vector representation will consequently exhibit s...
Frame-by-frame language identification in short utterances using deep neural networks
This work addresses the use of deep neural networks (DNNs) in automatic language identification (LID) focused on short test utterances. Motivated by their recent success in acoustic modelling for speech recognition, we adapt DNNs to the problem of identifying the language in a given utterance from the short-term acoustic features. We show how DNNs are particularly suitable to perform LID in rea...
A Step Beyond Local Observations with a Dialog Aware Bidirectional GRU Network for Spoken Language Understanding
Architectures of Recurrent Neural Networks (RNN) have recently become a very popular choice for Spoken Language Understanding (SLU) problems; however, they represent a big family of different architectures that can furthermore be combined to form more complex neural networks. In this work, we compare different recurrent networks, such as simple Recurrent Neural Networks (RNN), Long Short-Term Memory...